Abstract— Caffeic acid O-methyltransferases (COMTs) are essential enzymes for producing natural
products in plants and are involved in both the phenylalanine metabolic pathway and the monolignol
biosynthetic pathway. These enzymes methylate caffeic acid compounds, which are the building
blocks for many plant-derived compounds with various biological activities. The evolutionary
divergence, expression patterns under diverse abiotic stress conditions, and lignin content-related
features of the COMT gene family in Sorghum have not been explored. In this study, forty-eight
SbCOMTs were identified in S. bicolor. Based on their evolutionary relationships, the 48 SbCOMTs
were classified into two distinct groups. The gene characteristics and conserved motif patterns
within each group were similar, supporting the reliability of the phylogenetic classification.
Chromosomes 5 and 7 were found to be hotspots of SbCOMTs, with 10 and 7 genes, respectively.
Phylogenetic analysis revealed the conservation of Sorghum COMT genes among Zea mays and Oryza
sativa. Investigation of regulatory elements indicates the significant roles that COMT genes play
in the monolignol biosynthetic pathway of S. bicolor. Analysis of miRNAs, transcription factor
binding sites, and gene expression provides insights for further engineering of the lignin
biosynthetic pathway for better biofuel yield. We found that two SbCOMTs (SbCOMT26 and SbCOMT36)
were highly expressed and that their relative expression levels were similar to the variation trend
of lignin content under abiotic stress conditions in S. bicolor. These results provide a clue for
further study of the roles of SbCOMTs in the development of Sorghum and could serve as a foundation
for the cultivation of Sorghum with higher biomass and yield and enhanced abiotic stress tolerance.


Keywords— Caffeic acid O-methyltransferase, Monolignol biosynthesis, Abiotic stress, Biomass, 
Biofuels. 


Highlights

• Discovery of COMT gene family members in Sorghum helps in identification of genes
responsible for developmental lignification and their involvement in other metabolic
processes.

• The genomic location and tissue-specific expression analysis of caffeic acid
O-methyltransferase (COMT) genes under drought and salt stress reveal their critical role
in lignification in Sorghum bicolor.

• Cis-regulatory analysis, transcription factor prediction, and miRNA analysis of SbCOMTs
provide insights into manipulation of these genes for development of crops for better
biofuel yield.


I. INTRODUCTION 


O-methyltransferases (OMTs) catalyze a wide range of
reactions in lignin and flavonoid biosynthesis pathways.
COMTs are responsible for lignin biosynthesis and are
involved in phenylalanine metabolism in plants.
According to previous reports on monolignol production,
the key methylations of lignin precursors are primarily
facilitated by specific S-adenosyl-L-methionine
(SAM)-dependent enzymes, including caffeoyl CoA
3-O-methyltransferase (CCoAOMT; EC 2.1.1.104) and caffeic
acid O-methyltransferase (COMT; EC 2.1.1.68) (Louie et
al. 2010). The COMTs are grouped in plant type I of the
SAM-dependent O-methyltransferase family (Noel et al.
2003). These enzymes utilize S-adenosyl-methionine as a
methyl group donor and methylate the 5-hydroxyl group of
their substrate, 5-hydroxyconiferaldehyde, ultimately
leading to the production of S-lignin units. In
Arabidopsis thaliana, COMT may convert 5-OH
coniferaldehyde/5-OH coniferyl alcohol into
sinapaldehyde/sinapyl alcohol and caffeic acid into ferulic
acid, which results in the synthesis of both G and S units of
lignin (Goujon et al., 2003). Previous studies on
Arabidopsis thaliana by Lee et al. (2015) suggest that
COMT is also essential for the conversion of N-acetyl
serotonin to melatonin. The COMTs of sorghum can
methylate flavones such as luteolin and selgin
to aid the synthesis of tricin (Eudes et al. 2017).


Sorghum (Sorghum bicolor) is one of the primary staple
grains consumed in India, following rice (Oryza sativa)
and wheat (Triticum aestivum), and holds the 5th position
in global cereal production. In addition, it is a promising
crop for biofuel and a possible source of cellulosic
feedstock. The estimated size of its diploid genome is 730
Mb, and it has a haploid chromosome number of 10.
Plant-based renewable biofuels promise sustainable
solutions to food and energy demands. Sorghum is a highly
versatile source of food, feed, and biofuel globally, and it
is a useful crop for almost all renewable energy systems
that are being developed for green technology and
renewable fuels.


Lignin is a polyphenolic polymer deposited in the walls of
wood fibres, other vascular bundle cells, and thick-walled
cells. The three major monolignols, p-coumaryl alcohol,
coniferyl alcohol, and sinapyl alcohol, yield
p-hydroxyphenyl (H), guaiacyl (G), and syringyl (S)
subunits, respectively. Upon polymerization, these three
subunits form the rigid and complex lignin of plants. The
composition of these subunits determines the physical
properties and digestibility of lignin (Baucher et al.,
2003). Bugos et al. (1991) reported the first exploration of
the COMT gene family, in Populus tremuloides. Later, the
COMT gene family was characterized in several species,
including Eucalyptus grandis (seven COMTs; Carocha et al.,
2015), Catalpa bungei (23 COMTs; Lu et al., 2019),
blueberry (92 COMTs; Liu et al., 2021), Populus
trichocarpa (Chiang et al., 2010), Brassica rapa L. (Wei et
al., 2016), Betula pendula (25 COMT candidates; Chen et
al., 2020), and soybean (55 COMTs; Zhang et al., 2021). In
plants, COMT is involved in responses to a variety of
stresses, including drought (Yao et al., 2022), salt (Chang
et al., 2021), and cold (Zhang et al., 2021), as well as in
phytohormone signaling.


In the present study, COMT homologs were identified in
S. bicolor, and their gene structures, gene
characteristics, chromosomal locations, evolutionary
relationships, conserved motifs, cellular localization,
promoter elements, protein models, protein-protein
interactions, predicted miRNAs, predicted transcription
factors, and expression patterns were analyzed. These
findings would help in the manipulation of the lignin
biosynthetic pathway for better biofuel yield and in
breeding Sorghum cultivars with enhanced abiotic and
biotic stress tolerance.


II. MATERIALS AND METHODS
Plant material and induction of stress 


Seeds of the Sorghum bicolor high-biomass variety (IS
4698) were obtained from the Indian Institute of Millets
Research (IIMR), Rajendranagar, Hyderabad, and sown in
pots filled with 4 kg of black soil at the Departmental
Farm, Department of Genetics, Osmania University,
Hyderabad (India). Seedlings were raised in a glass house
environment at 28/20 °C day/night temperatures.
Sixty-five-day-old seedlings were treated with 200 mM NaCl
solution and 200 mM mannitol solution for 48 hours each.
Under comparable conditions, corresponding controls were
kept well-watered and without any treatment. Various plant
tissues, such as leaves, stems, and roots, were collected
from both the treated and control groups. These tissues
were then snap-frozen in liquid nitrogen and preserved at
-80 °C for future use. Three technical and three biological
replicates were employed for the qRT-PCR study.


In silico prediction, identification, and characterization 
of SbCOMT genes


For the identification of the COMT gene family in
Sorghum, the protein sequences of the Sorghum bicolor
genome were retrieved from the Phytozome
(http://www.phytozome.net/) plant database to use as the
local protein database. Previously characterized
Arabidopsis COMT genes were used to perform a BLASTP
search against the local protein database with a threshold
of E-value < 1e-5. The PFAM profile was used as the
query to search against the local protein database using
HMMER 3.0 with a threshold of E-value < 1e-5. Based on
the results of HMMER and BLASTP, the redundant
sequences were removed. Then, the putative Sorghum
COMT proteins were searched against the PFAM database
(http://pfam.xfam.org/search) to predict the conserved
protein domains, and those containing a complete COMT
domain were retained as candidates.
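
The merging of the two searches can be illustrated with a short Python sketch. It is not the authors' pipeline: it assumes BLASTP was run with tabular output (-outfmt 6), that HMMER results were saved with --tblout, and that the two hit lists were combined as a union before redundant identifiers were dropped. The function names are hypothetical.

```python
from typing import Iterable, List, Set

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def blastp_hits(lines: Iterable[str], cutoff: float = E_VALUE_CUTOFF) -> Set[str]:
    """Collect subject IDs from BLAST tabular output (-outfmt 6).

    In that format the subject ID is column 2 and the E-value is column 11.
    """
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[10]) < cutoff:
            hits.add(fields[1])
    return hits


def hmmer_hits(lines: Iterable[str], cutoff: float = E_VALUE_CUTOFF) -> Set[str]:
    """Collect target names from an hmmsearch --tblout table.

    Columns are whitespace-separated; the target name is column 1 and the
    full-sequence E-value is column 5.
    """
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if float(fields[4]) < cutoff:
            hits.add(fields[0])
    return hits


def candidate_ids(blast_lines: Iterable[str], hmmer_lines: Iterable[str]) -> List[str]:
    """Union of both searches (an assumption); the set removes redundant IDs."""
    return sorted(blastp_hits(blast_lines) | hmmer_hits(hmmer_lines))
```

The resulting identifiers would still need the domain check described above before being accepted as COMT candidates.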


Gene structure prediction, conserved motif analysis, 
sub-cellular localization, and protein parameters 


Gene structures were predicted from GTF annotation files
using TBtools. WoLF PSORT was used to predict the
subcellular localization of the proteins. Protein
parameters, namely isoelectric point (pI), molecular
weight (MW), GRAVY (grand average of hydropathy),
instability index, and aliphatic index, were computed with
ProtParam. Conserved motifs were identified with MEME
using a maximum of 10 motifs, 2-20 sites per motif, and
motif widths of 6-20. The genes were mapped on the
chromosomes based on their physical location using the
online PhenoGram tool.
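
Two of the protein parameters named above have simple closed-form definitions, sketched here in Python. This is an illustration of the published formulas (Kyte-Doolittle hydropathy for GRAVY; Ikai's aliphatic index), not ProtParam's own code, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

A positive GRAVY value indicates an overall hydrophobic protein and a negative value a hydrophilic one.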


Phylogenetic analysis, multiple sequence alignment, 
generation of Synteny maps, and Ka/Ks analysis 


A phylogenetic tree was constructed using the MEGA v10
program, employing the Neighbour-Joining algorithm
(Kumar et al. 2018) with 1000 bootstrap replicates, based on
the amino acid sequences of Sorghum bicolor (Sb), Oryza
sativa (Os), Zea mays (Zm), and Arabidopsis thaliana (At).
SbCOMT26 and 36 proteins were aligned with orthologs
in the above-mentioned species using MEGA v10.0. All the
predicted SbCOMT homologs were mapped on the Z. mays
and O. sativa genomes, and synteny maps were generated
with TBtools (Chen et al. 2020). The ratios of
non-synonymous to synonymous substitution rates (Ka/Ks)
and the divergence times (million years ago, MYA) of the
SbCOMT paralog pairs were calculated with an online Ka/Ks
calculator.
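
How Ka/Ks values are usually interpreted, and how a divergence time can be derived from Ks, is sketched below in Python. The source does not state the substitution rate it used; the default of 6.5e-9 substitutions per synonymous site per year is an assumption taken from common practice for grasses, and the function names are hypothetical.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def selection_type(ratio: float, tolerance: float = 1e-9) -> str:
    """Conventional reading: <1 purifying, ~1 neutral, >1 positive selection."""
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"


def divergence_time_mya(ks: float, rate: float = 6.5e-9) -> float:
    """Divergence time T = Ks / (2 * rate), converted to million years.

    `rate` is an assumed synonymous substitution rate per site per year.
    """
    return ks / (2.0 * rate) / 1e6
```

For example, a pair with Ka = 0.05 and Ks = 0.5 has a ratio of 0.1 and would be classified as under purifying selection.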


Prediction of cis-elements, protein modelling, and 
protein-protein interactions 


Promoter sequences were obtained for all the SbCOMT
genes from the Phytozome database: the 1500-bp
sequence upstream of each Sorghum COMT homolog
was extracted and submitted to the PlantCARE database
(http://bioinformatics.psb.ugent.be/webtools/plantcare/html/)
to predict cis-elements. The 3D structures of all the
SbCOMT proteins were predicted using the SWISS-MODEL
server (https://swissmodel.expasy.org/) (Biasini
et al. 2014). The predicted 3D structures of proteins were
evaluated for stability using the Protein Structure
Verification Server (PSVS) (https://saves.mbi.ucla.edu/)
and Ramachandran plots. The predicted protein-protein
interaction (PPI) map of Sorghum COMT homologs was
generated from the STRING database
(https://string-db.org/).


ISSN: 2456-1878 (Int. J. Environ. Agric. Biotech.)
https://dx.doi.org/10.22161/ijeab.94.37


Genome-Wide Identification, Characterization, and Expression analysis of the Caffeic Acid O-Methyl
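
The 1500-bp upstream extraction described in this subsection can be sketched in Python as follows. This is a minimal illustration, assuming 1-based inclusive gene coordinates as in GFF/GTF files and assuming that "upstream" is measured from the annotated gene start; the source does not say whether the gene start or the start codon was used. The function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (N is kept as N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq: str, gene_start: int, gene_end: int,
                    strand: str, length: int = 1500) -> str:
    """Return up to `length` bases upstream of a gene, read 5'->3'
    on the gene's own strand. Coordinates are 1-based and inclusive."""
    if strand == "+":
        end = gene_start - 1                      # last base before the gene
        start = max(1, end - length + 1)          # clip at chromosome start
        return chrom_seq[start - 1:end]
    if strand == "-":
        start = gene_end + 1                      # first base after the gene
        end = min(len(chrom_seq), gene_end + length)  # clip at chromosome end
        return reverse_complement(chrom_seq[start - 1:end])
    raise ValueError("strand must be '+' or '-'")
```

Genes close to a chromosome end yield a promoter shorter than the requested length rather than an error.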


miRNA and Transcription factor analysis 


We predicted miRNAs that might target SbCOMTs to
control their expression using the psRNATarget tool
(https://www.zhaolab.org/psRNATarget/) with default
parameters, with all of the Sorghum miRNAs as queries.
The regulatory network of SbCOMT genes and miRNAs
was visualized using Cytoscape (https://cytoscape.org/).
Transcription factor binding sites of all SbCOMT
homologs were predicted with the Plant Transcription Factor
Database (PlantTFDB) (http://planttfdb.gao-lab.org/), and a
network was built using Cytoscape.


In silico expression analysis of SbCOMT genes


The transcriptome data (FPKM) of Sorghum were
downloaded from the Gramene database
(https://www.gramene.org/). The transcriptome data
include baseline expression of SbCOMTs in various organs
of Sorghum (Davidson et al. 2012), vascular and
non-vascular tissue (Turco et al. 2017), stem internodes of
bioenergy Sorghum (Kebrom et al. 2017), and expression
patterns in leaf and root tissue under drought conditions
(Varoquaux et al. 2019). The expression patterns were
visualized as a heat map built with TBtools.
Transcriptome (FPKM) data of S. bicolor under osmotic
stress and ABA stress (Acc: SRP007361) (Dugas et al.
2011) were mined from the Morokoshi Sorghum transcriptome
database (http://sorghum.riken.jp).
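
FPKM values are commonly transformed before being drawn as a heat map. The Python sketch below shows one frequent choice, log2(FPKM + 1) followed by per-gene z-scores. The source does not state how the values were scaled for plotting, so both the pseudocount and the row scaling are assumptions, and the function names are hypothetical.

```python
import math
from typing import List


def log2_fpkm(values: List[float], pseudocount: float = 1.0) -> List[float]:
    """log2(FPKM + pseudocount); the pseudocount keeps zero counts finite."""
    return [math.log2(v + pseudocount) for v in values]


def row_zscores(values: List[float]) -> List[float]:
    """Scale one gene's values across samples to mean 0 and unit variance.

    A gene with identical values in every sample is returned as all zeros.
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]
```

Row scaling highlights in which tissue or treatment a gene peaks, but it hides absolute differences between genes, so unscaled log values are preferable when genes are compared with one another.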


Expression analysis of SbCOMT genes by qRT-PCR


Total RNA was extracted from stress-exposed and control
(without stress) plants using the Trizol reagent method.
The purity of the RNA was determined using an Eppendorf
BioPhotometer. One microgram of RNA was used as a
template for first-strand cDNA synthesis with the
PrimeScript™ RT Reagent Kit (Takara, Japan) according
to the manufacturer's instructions. 2X SYBR Premix Ex
Taq (Tli RNaseH Plus, Takara, Japan) Master Mix with
gene-specific primers (Table 5) was used to determine the
relative gene expression levels of SbCOMTs.


Thermal cycling conditions of 95 °C for 2 min, followed by
40 cycles of 95 °C for 30 s, 58 °C for 30 s, and 72 °C for 30
s, were programmed in the ABI 7500 real-time PCR
system (Applied Biosystems, Foster City, CA) for
qRT-PCR analysis. SbCOMT gene expression in both treated
and control samples was normalized using the EIF4a
(Eukaryotic Initiation Factor 4A) reference gene. For each
sample, qRT-PCR was performed using three biological
and three technical replicates. The relative amounts (fold
change) of each transcript were calculated using the
comparative 2^-ΔΔCT method.
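
The comparative CT calculation can be written out in a few lines of Python. The sketch below averages replicate CT values first, which is one common convention and an assumption here; the function name and the example CT values are hypothetical, not measurements from this study.

```python
from statistics import mean
from typing import Sequence


def fold_change(ct_target_treated: Sequence[float],
                ct_ref_treated: Sequence[float],
                ct_target_control: Sequence[float],
                ct_ref_control: Sequence[float]) -> float:
    """Relative expression by the comparative 2^-(ddCT) method.

    dCT  = CT(target) - CT(reference), per condition
    ddCT = dCT(treated) - dCT(control)
    """
    dct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    dct_control = mean(ct_target_control) - mean(ct_ref_control)
    ddct = dct_treated - dct_control
    return 2.0 ** (-ddct)


# Hypothetical CT values: the target crosses threshold 3 cycles earlier
# (relative to the reference gene) in treated samples.
example = fold_change([22.0, 22.0, 22.0], [18.0, 18.0, 18.0],
                      [25.0, 25.0, 25.0], [18.0, 18.0, 18.0])
```

With these hypothetical inputs the target gene is 8-fold up-regulated, because each cycle of difference corresponds to a doubling under the assumption of 100% amplification efficiency.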


III. RESULTS 


Identification and Characterization of SbCOMT 
homologs in Sorghum bicolor 


A total of 48 candidate sequences were obtained from the
Sorghum genome, and all of them were scanned for a
Methyltransf_2 domain. Forty-eight sequences containing a
Methyltransf_2 domain (Fig. 1.a) were identified in the
Sorghum diploid genome. All of them were mapped on
pseudochromosomes and renamed SbCOMT1 to SbCOMT48
(Table 1). The protein parameters (Table 2) and gene
structural characteristics were analyzed (Fig. 1.b). The
results showed that SbCOMT43 was the shortest protein
(100 amino acids) and SbCOMT9 was the longest.


The molecular weights of the 48 SbCOMT proteins ranged
from 39 to 47 kDa, and the isoelectric points ranged from
4.65 to 7.13. Analysis of the conserved domains and gene
structures indicated that all COMT genes encode a
catalytic domain at the C-terminus, the Methyltransf_2
domain, which encompasses a binding pocket for SAM/SAH
and the AdoMet-MTase superfamily domain. Some of them
also share an N-terminal dimerization domain. The binding
pocket for SAM/SAH was highly conserved, whereas the
substrate-binding sites differed between proteins belonging
to different groups. SbCOMT32, 34, 37, and 38 contain
three introns and four exons; SbCOMT9, 26, and 47 consist
of two introns and three exons; and SbCOMT30, 40, and 41
have a single exon without introns. The patterns of the
Methyltransf_2 domain in SbCOMTs were similar within the
same group.
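
Exon and intron counts such as those reported above can be derived directly from a GTF annotation. The Python sketch below is only an illustration of the idea, not the TBtools implementation: it counts `exon` features per `transcript_id` and takes the intron count as exons minus one. The function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def exon_counts(gtf_lines: Iterable[str]) -> Dict[str, int]:
    """Count exon features per transcript in GTF-formatted lines.

    GTF columns are tab-separated; column 3 is the feature type and
    column 9 holds attributes such as transcript_id "...".
    """
    counts: Dict[str, int] = defaultdict(int)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


def intron_counts(exons: Dict[str, int]) -> Dict[str, int]:
    """A transcript with n exons has n - 1 introns."""
    return {transcript: n - 1 for transcript, n in exons.items()}
```

A single-exon transcript therefore reports zero introns, matching the intronless genes described in the text.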


The structural differences in protein sequences across the
Sorghum COMTs were assessed using the Multiple
Expectation Maximisation for Motif Elicitation (MEME)
online tool (Fig. 1.c). A total of 10 motifs were found in
the sorghum COMT proteins. Most of the motifs were the
same in the two groups, and they were in the same order in
COMT proteins within the same group. The consensus
motif 1 (Methyltransferase-2) and motif 2 (AdoMet-MTase)
are found in all SbCOMTs.


Most of the SbCOMT proteins are predicted to localize in
the cytoplasm, followed by the chloroplast, plasma
membrane, and mitochondria. Only SbCOMT39 is present in
mitochondria; SbCOMT9, SbCOMT10, SbCOMT15,
SbCOMT16, SbCOMT23, SbCOMT40, SbCOMT41,
SbCOMT44, and SbCOMT45 are localized on the plasma
membrane; SbCOMTS, SbCOMT24, SbCOMT26,
SbCOMT27, SbCOMT32, SbCOMT34, SbCOMT35,
SbCOMT37, SbCOMT38, SbCOMT43, SbCOMT46,
SbCOMT47, and SbCOMT48 are localized in the
chloroplast; and the rest of the SbCOMT homologs are
found in the cytoplasm (Table 1).


Phylogenetic analysis, multiple sequence alignment, 
chromosomal location, synteny, and Ka/Ks analysis of 
SbCOMTs


The phylogenetic tree analysis revealed the evolutionary
relationships of SbCOMT homologs in Sorghum bicolor
with those of Oryza sativa, Zea mays, and Arabidopsis
thaliana (Fig. 2.a). A total of 15 SbCOMT paralogs were
identified. A neighbour-joining (NJ) phylogenetic tree
created with Sorghum COMT protein sequences showed that
the proteins were distributed into two groups.


Sorghum showed 12 ortholog pairs, 11 with Zea mays
(SbCOMT9 and ZmCOMT18, SbCOMT10 and ZmCOMT12,
SbCOMT11 and ZmCOMT20, SbCOMT16 and ZmCOMT29,
SbCOMT17 and ZmCOMTS, SbCOMT20 and ZmCOMT6,
SbCOMT26 and ZmCOMT22, SbCOMT30 and ZmCOMT31,
SbCOMT31 and ZmCOMT4, SbCOMT36 and ZmCOMT28,
SbCOMT43 and ZmCOMT17) and 1 with Oryza
(SbCOMTS5 and OsCOMT16). The locations of SbCOMT
homologs were mapped on the Sorghum genome (Fig. 2.b).
SbCOMT genes are scattered on all 10 chromosomes of
Sorghum. The ratios of non-synonymous to synonymous
substitution rates (Ka/Ks) of 9 Sorghum paralog pairs
(SbCOMT1 and SbCOMT2, SbCOMT3 and SbCOMT4,
SbCOMTS and SbCOMT12, SbCOMT14 and SbCOMT20,
SbCOMT18 and SbCOMT21, SbCOMT19 and SbCOMT22,
SbCOMT23 and SbCOMT24, SbCOMT28 and SbCOMT30,
SbCOMT32 and SbCOMT34) were calculated (Table 3).


Multiple sequence alignments of SbCOMT26 and 36 with
orthologs in other species displayed highly conserved
residues, which indicates that these genes are conserved
among species. The selection pressures on the COMTs in S.
bicolor were explored based on the Ka/Ks ratios. The
lowest Ka/Ks (0.05735056) was observed for the gene pair
SbCOMT18 and SbCOMT21, and the highest Ka/Ks was
observed for the gene pair SbCOMT32 and SbCOMT34. The
Ka/Ks ratios of all SbCOMT paralog pairs are <1, which
indicates that SbCOMTs experienced purifying selection
during evolution.


Collinearity analysis of SbCOMT homologs was
performed against the genomes of Zea mays and Oryza
sativa (Figs. 3a and b). Pink-coloured links represent
homologs between two genomes. S. bicolor chromosomes 1,
5, and 9 display 3, 3, and 2 homologs, respectively, in the
Zea mays genome. S. bicolor chromosome 1 shows 2
homologs on Zea mays chromosome 9 and 1 homolog on
Zea mays chromosome 10; S. bicolor chromosome 4
displays 1 homolog on Zea mays chromosome 4; S. bicolor
chromosome 5 displays 2 homologs on Zea mays
chromosome 4 and 1 homolog on chromosome 2; and S.
bicolor chromosome 9 shows 1 homolog each on Zea mays
chromosomes 6 and 8. The S. bicolor genome shows a total
of 6 homologs in the O. sativa genome: S. bicolor
chromosome 4 shows 1 homolog on O. sativa chromosome
2; chromosome 3 displays 1 homolog each on O. sativa
chromosomes 1 and 5; chromosome 5 shows 1 homolog on
O. sativa chromosome 1; chromosome 7 displays 1 homolog
on O. sativa chromosome 8; and chromosome 9 shows 1
homolog on O. sativa chromosome 5.


Cis-regulatory elements analysis of SbCOMTs 


The initiation of transcription is a pivotal phase in gene
expression, in which RNA polymerase interacts with
regulatory sequences such as the promoter and thereby
influences the level of gene expression (Liu et al. 2019).
Promoter analysis of SbCOMT homologs revealed putative
cis-regulatory elements related to lignin biosynthesis,
abiotic stress, light response, and phytohormone response
(Fig. 4). Defense-responsive, wound-responsive,
MYB-drought-responsive, MYB-light-responsive, and
MYB-flavonoid-gene-related cis-regulatory elements are
found in the promoter regions of SbCOMT genes. MYB and
NAC binding elements were the most numerous in all the
SbCOMT homologs, suggesting their involvement in lignin
biosynthesis and stress tolerance. SbCOMT homologs
contain defense-responsive elements, suggesting
involvement in biotic stress-related defense. Most COMT
homologs have phytohormone-responsive elements such as
ABRE, MeJARE, GARE, SARE, and AURE. A large majority
of SbCOMTs had ABRE, related to abscisic acid, and
MeJARE, related to methyl jasmonate. MeJARE and SARE,
which are also defense-responsive, were the most abundant
of the phytohormone-responsive elements and were
identified in all the SbCOMT homologs, suggesting their
involvement in defense mechanisms. Light-responsive
elements are also found in the promoter regions of SbCOMT
homologs, which indicates that COMT genes in S. bicolor
may be regulated by light. Similar cis-regulatory elements
among homologs may contribute to similar gene expression
patterns and gene functions.


3D structures and PPI analysis of SbCOMTs


3D structures of SbCOMT proteins were predicted with the
best PDB templates (Fig. 5.a). The template PDB ID,
chain, oligomeric state of the model, and structure
validation results are presented in the corresponding
table. The 3D structure of SbCOMT36 displayed 100%
identity with the caffeic acid O-methyltransferase of S.
bicolor (PDB ID: 4pgg.1.A), and SbCOMT38 showed 100%
identity with the stilbene O-methyltransferase protein
(PDB ID: 7vb8.1.A). The rest of the SbCOMT homologs
displayed identities with their corresponding templates
ranging from 31% to 66%. In the predicted PPI map,
SbCOMT36, one of the expressed and characterized SbCOMT
proteins, exhibited interactions with several proteins
(Fig. 5.b). The SbCOMT36 network comprises 11 nodes
connected by 38 edges, and each protein showed more than
one interactant. The proteins that display interactions
with SbCOMT (Sb07g003860.1) are CAD (Sb04g005950.1),
the terminal gene in lignin biosynthesis; phenylalanine
ammonia-lyase (Sb04g026510.1, Sb04g026520.1,
Sb01g014020.1, Sb06g022750.1, Sb06g022740.1), which is
involved in the L-phenylalanine catabolic process, the
phenylpropanoid biosynthetic process, and the
phenylpropanoid metabolic process; probable
4-coumarate-CoA ligase 1 (Sb07g007810.1), which is
involved in the early stages of lignin biosynthesis; F5H
(Sb02g002630.1), which is involved in the conversion of
coniferaldehyde to sinapaldehyde in lignin biosynthesis;
and folylpolyglutamate synthase (Sb01g049840.1), which is
involved in purine, pyrimidine, and amino acid synthesis.


miRNA and Transcription factor binding site prediction 


Additionally, we predicted the miRNAs that might target
SbCOMTs to regulate their expression. In total, 19
SbCOMT genes were found to be targeted by 31 miRNAs,
and a miRNA-SbCOMT interaction network was constructed
(Fig. 6.a). The miRNA-SbCOMT relationships and the
co-regulation modules of SbCOMTs provide some insights
into the regulation of SbCOMT expression to control lignin
biosynthesis. In silico analysis of SbCOMTs also revealed
the presence of numerous cis-elements that may serve as
binding sites for transcription factors with vital
functions in lignin biosynthesis. To examine this further,
PlantTFDB (Jin et al. 2014) was used to predict
transcription factors binding to the regulatory regions of
SbCOMTs. The model displays interactions with various
transcription factors such as Dof, LFY, BES1, MYB-related,
E2F, HSF, TCP, ARF, ERF, MIKC-MADS, SBP, NAC, MYB,
and LBD (Fig. 6.b). Plants possess an MYB sub-family
characterized by the R2R3-type MYB domain, members of
which act as master regulatory switches in secondary cell
wall biosynthesis (McCarthy et al. 2009; Zhong et al. 2012;
Kim et al. 2019). They might also directly activate some
lignin genes through the secondary wall MYB-responsive
element (SMRE) binding site (consensus motif
ACC(A/T)A(A/C)(T/C)) in the promoter region (Zhong et al.
2012). MYB transcription factors function specifically in
the regulation of lignin biosynthesis (Stracke et al. 2001).
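
The SMRE consensus quoted above, ACC(A/T)A(A/C)(T/C), translates directly into a regular expression. The Python sketch below scans both strands of a promoter sequence for it. It is an illustration only and not the PlantTFDB or PlantCARE prediction method; the function names and test sequences are hypothetical.

```python
import re
from typing import List, Tuple

# ACC(A/T)A(A/C)(T/C) as a regex; the lookahead also reports overlapping sites.
SMRE = re.compile(r"(?=(ACC[AT]A[AC][TC]))")
MOTIF_LENGTH = 7


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_smre(promoter: str) -> List[Tuple[int, str, str]]:
    """Return (position, strand, matched site) for every SMRE occurrence.

    Positions are 1-based and always refer to the leftmost base of the
    site on the forward strand, whichever strand carries the motif.
    """
    seq = promoter.upper()
    hits = [(m.start() + 1, "+", m.group(1)) for m in SMRE.finditer(seq)]
    reverse = reverse_complement(seq)
    for m in SMRE.finditer(reverse):
        forward_start = len(seq) - (m.start() + MOTIF_LENGTH) + 1
        hits.append((forward_start, "-", m.group(1)))
    return sorted(hits)
```

A 7-bp degenerate motif is expected to occur by chance in long promoters, so hits from such a scan are candidates for follow-up rather than evidence of MYB binding.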


The NAC family of transcription factors is composed of a
vast array of proteins. NAC transcription factors
contribute to plant responses to pathogens, viral
infections, and environmental stimuli such as drought and
salinity (Xie et al., 1999; Ren et al., 2000;
Collinge et al., 2001; Kim et al., 2007). Certain NAC
transcription factors have been identified as playing a
crucial role in controlling cell aging, proliferation, and the
development of wood (Takada et al., 2001; Vroemen et al.,
2003; Weir et al., 2004; Zhong et al., 2006; Kim et al.,
2007; Yamaguchi et al., 2008).


In silico expression analysis of SbCOMTs


The transcriptome data (FPKM) of SbCOMT genes were
analyzed to determine the expression patterns of these
genes under normal growth conditions and under drought,
osmotic, and abscisic acid stress conditions in various
tissues and organs of S. bicolor. We examined the
expression patterns of SbCOMT homologs in different
regions of the stem internodes of bioenergy sorghum (Fig.
7.a). Among all SbCOMTs, SbCOMT36 displayed the highest
expression in all regions of the stem internodes.
SbCOMT11 and 7 showed their highest expression in
internode regions 2 and 3. In vascular and non-vascular
tissues of sorghum (Fig. 7.b), SbCOMT6, 23, 25, 26, and 27
exhibited the highest expression, and the rest of the genes
were expressed in the medium to very low range. In the
baseline expression analysis (Fig. 7.c), at the embryonic
stage, SbCOMT26 showed a medium expression level,
SbCOMT34 and 30 displayed a low level of expression,
and the rest of the genes did not show significant
expression. At the flowering stage, SbCOMT36
displayed the highest level of expression, and SbCOMT33
and 17 showed a medium range of expression. In the floral
meristem, only SbCOMT26 showed a medium level of
expression; none of the other SbCOMT homologs displayed
significant expression. In meristematic tissue, only
SbCOMT26 and 36 showed medium-range expression. In the
shoot, SbCOMT17 and 36 are highly expressed, and
SbCOMT19, 11, and 26 exhibit an average range of
expression. In root tissue, SbCOMT36 and 20 exhibited the
highest expression, followed by SbCOMT17 and 3.
We analyzed the expression patterns of SbCOMTs in leaf
and root tissues under drought conditions at different
growth stages (Fig. 7.d). At 42 days of growth, SbCOMT17,
25, 26, and 30 were highly up-regulated in leaf tissue under
drought stress. SbCOMT12, 29, and 33 were up-regulated in
leaf tissue after 63 days of growth. At 77 days of growth,
SbCOMT6, 25, and 26 were up-regulated, and only a few
SbCOMT homologs were down-regulated in leaf tissue under
drought conditions.


None of the SbCOMT genes exhibited significant
expression in leaf tissue at the remaining growth stages
of sorghum under drought stress. In root tissue under
drought conditions, SbCOMT1, 2, 20, 21, 37, 38, and 39
displayed low and constant expression at all stages of
growth. SbCOMT25 was up-regulated in root tissue at 35
and 77 days under drought, and SbCOMT6 and 36 displayed
their highest expression at 77 days of growth under
drought stress.


Additionally, we explored the expression patterns of
SbCOMT homologs in S. bicolor shoot and root tissue
under osmotic (NaOH), ABA, and PEG stress conditions
(Fig. 7.e). SbCOMT36 was highly up-regulated in root
tissue under osmotic stress conditions and displayed
moderate expression in root tissue under ABA and PEG
stress conditions. SbCOMT36 also showed moderate
expression in shoot tissue under almost all of the
above-mentioned stress conditions. SbCOMT20 and
SbCOMT17 displayed medium expression in root tissue
under osmotic stress conditions and in shoot tissue under
PEG treatment, respectively.


qRT-PCR analysis of SbCOMT genes


Based on the in silico transcriptome analysis, SbCOMT26
and 36 were selected for qRT-PCR analysis in different
tissues and organs of Sorghum bicolor under control,
drought, and salinity stress conditions because of their
expression in the major lignifying organs of sorghum.
The SbCOMT homologs showed variable gene expression
across different tissues (Fig. 8). SbCOMT26 expression
was significantly higher during drought stress compared to
salt stress. Among the stress treatments, drought-stressed
leaves and salt-stressed stem tissues showed a 12.47-fold
and 11.75-fold rise in transcript levels of SbCOMT26,
respectively. However, there was no significant increase in
SbCOMT26 expression under drought and salt stress
conditions in other sorghum tissues. In contrast, SbCOMT36
expression was higher under salt stress than under
drought stress. SbCOMT36 expression was significantly
increased in salt-stressed stems. Under salt-stress
conditions, there was a 32-fold rise in SbCOMT36
transcript levels in shoots, and there was a 3-fold rise in
drought-stressed roots.


IV. DISCUSSION 


Caffeic acid O-methyltransferases (COMTs) are essential
enzymes that contribute significantly to the synthesis of
lignin and the phenylalanine metabolic pathway in plants.
Attempts are frequently made to manipulate the lignin
makeup of genetically modified crops to enhance their
digestibility as forage, effectiveness in pulping, and the
production of biofuels. L-phenylalanine serves as the
starting material for the synthesis of monolignols.
According to the current understanding of monolignol
production, the essential O-methylations of hydroxyl
groups on the phenolic ring of monolignol precursors are
primarily facilitated by specific S-adenosyl-L-methionine
(SAM)-dependent enzymes, including caffeoyl CoA
3-O-methyltransferase (CCoAOMT; EC 2.1.1.104) and caffeic
acid O-methyltransferase (COMT; EC 2.1.1.68) (Louie et
al. 2010). The COMTs are classified in the plant type-1
family of SAM-dependent O-methyltransferases (Noel et
al. 2003). Sorghum caffeic acid O-methyltransferase uses
S-adenosyl-methionine as a donor of methyl groups and
methylates the 5-hydroxyl group of its favored substrate,
5-hydroxyconiferaldehyde, ultimately leading to the
production of S-lignin units. O-methyltransferases (OMTs)
are responsible for a variety of versatile reactions in the
biosynthesis pathways of lignin and flavonoids.


Since COMTs may act on a variety of substrates,
including phenylpropanoids, flavonoids, and alkaloids,
they are likely to respond to a variety of stimuli. As a
result, they are ubiquitous in plants, reflecting their
significance in helping plants adapt to their environment
and to challenging conditions (Nomura et al., 2010). The
publication of diverse plant genomes has allowed analyses
of COMT family genes to be carried out in several species
(Barakat et al., 2011; Wu et al., 2013; Liu et al., 2016).
This family is chiefly involved in lignin biosynthesis,
and lignin provides mechanical strength to plants. Sorghum
bicolor, grown primarily as a food, forage, and biofuel
crop, has been extensively studied for its large amounts
of flavonoids. The Sorghum v3 genome was released in 2017,
and 48 COMTs have been identified, named SbCOMT1 to
SbCOMT48. Subsequently, many homologs have been
detected in plants. It remains to be determined which of
these homologs carry out the crucial processes of plant
growth, flavonoid metabolism, phenylalanine metabolism,
stress tolerance, and lignin biosynthesis. Four plants,
Sorghum bicolor, Zea mays, Oryza sativa, and Arabidopsis
thaliana, were examined in this study, and each was shown
to have a distinct number of COMTs. COMTs in all these
plants contain the conserved methyltransferase-2 and
dimerization domains. Furthermore, we found that the
number of COMTs in S. bicolor is greater than that in O.
sativa, Zea mays, and Arabidopsis thaliana. Oryza sativa
contains the second-highest number of COMTs (39 COMT
homologs), followed by Zea mays (32 COMT homologs).
Arabidopsis thaliana has the fewest COMTs (17 COMT
homologs) of all the studied species. The conserved
domains of the identified SbCOMT homologs, i.e., the
methyltransferase-2 and dimerization domains, agree with
those of other plants (Liu et al., 2021).


Lignin is the key component of vascular tissue and
provides plants with the structural support to stand
upright. COMTs are important enzymes involved in lignin
biosynthesis that catalyze the methylation of S-lignin
monomers. Evolutionary analysis suggests that these 48
SbCOMTs fall into two groups, Group I (subdivided into
Group Ia and Group Ib) and Group II. SbCOMT homologs were
more closely related to O. sativa COMTs than to those of
Arabidopsis thaliana. All the identified SbCOMT proteins
contain the conserved methyltransferase-2 (PF00891)
domain, which has 207 amino acid residues, including a
SAM/SAH binding pocket and a substrate-binding site, and
the dimerization domain (PF08100), which contains 52 amino
acid residues. All the discovered SbCOMT homologs
displayed conserved AdoMet-MTase superfamily domains.


In the present research, all the identified Sorghum
homologs contained the conserved methyltransferase-2 and
dimerization domains, and they were associated with
numerous functions. About 20-30% of SbCOMT homologs
belong to the isoflavone O-methyltransferase family. These
methyltransferases are involved in secondary metabolite
biosynthesis and isoflavonoid biosynthesis (BRENDA: EC
2.1.1.46). Some of the SbCOMTs belong to the ZPR3 and ZPR4
families, which encode O-methyltransferases and might be
implicated in suberin biosynthesis (Held et al. 1993).
SbCOMTs belonging to the trans-resveratrol
di-O-methyltransferase and resveratrol O-methyltransferase
families play vital roles in biotic (Sambangi et al. 2016)
and abiotic stress responses (Chiron et al. 2000).
SbCOMT34, 37, and 38 functionally belong to the
isoeugenol O-methyltransferase family and regulate the
biosynthesis of secondary metabolites and phenylpropanoid
biosynthesis (BRENDA: EC 2.1.1.146).


Collinearity analysis of SbCOMTs with other species,
including O. sativa and Z. mays, revealed varied
collinearity with each species. One of the SbCOMT genes
shows two collinearity blocks on the Z. mays genome. This
observation suggests that Z. mays has undergone two
rounds of whole-genome duplication. Whole-genome
duplication (WGD) and tandem duplication (TD) are the key
forces behind gene expansion in Populus (Chiang et al.,
2010; Barakat et al., 2011). COMTs of maize, rice, and
foxtail millet have similar gene copy numbers (Liu et al.
2019). The SbCOMT homolog gene pairs had Ka/Ks ratios of
<1, indicating that the SbCOMTs had undergone
significant purifying selection. The cis-regulatory
elements in the promoter regions are the sites at which
other proteins bind the COMT genes, and they play an
essential role in regulating gene transcription. The
promoters contained a large number of light-responsive
elements, phytohormone-responsive elements involved in
plant defense mechanisms and growth, drought
stress-responsive elements, and regulatory elements that
promote lignin synthesis (Sega et al., 2020).
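
The Ka/Ks criterion used for the paralog pairs can be sketched as follows. The Ka and Ks values are copied from three rows of Table 3. The function names and the tolerance used to call a ratio "neutral" are illustrative assumptions and do not describe the Ka/Ks calculator used in the study.

```python
# (gene 1, gene 2, Ka, Ks) copied from three rows of Table 3.
PAIRS = [
    ("SbCOMT3", "SbCOMT4", 0.02093362, 0.166845),
    ("SbCOMT28", "SbCOMT30", 0.099433025, 0.390486),
    ("SbCOMT32", "SbCOMT34", 0.066990967, 0.080131),
]


def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks; Ks must be positive for the ratio to be defined."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def selection_mode(ratio: float, tolerance: float = 0.05) -> str:
    """Conventional reading: <1 purifying, about 1 neutral, >1 positive.

    The tolerance around 1 is an arbitrary choice made for this sketch.
    """
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


for gene_1, gene_2, ka, ks in PAIRS:
    ratio = ka_ks_ratio(ka, ks)
    print(f"{gene_1}/{gene_2}: Ka/Ks = {ratio:.6f} ({selection_mode(ratio)})")
```

All three pairs give ratios below 1, which is the basis for reading them as being under purifying selection.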


COMT expression is upregulated in plants when they are
stressed or exposed to hormones (Asif et al., 2014; Zhang
et al., 2015; Li et al., 2016; Fu et al., 2019). According
to in silico expression analysis, SbCOMT26 and SbCOMT36
are highly expressed in all tissues of sorghum under
natural conditions and also in leaf and root tissues under
drought stress at different growth intervals. This study
demonstrates the relationship between SbCOMTs, crucial
enzymes in the biosynthesis of monolignols, and the
mechanisms by which sorghum adapts to drought stress.
Zhang et al. (2021) reported that the COMT gene family
plays a significant role in plant defense against abiotic
stress and in lignification under drought conditions.
Under drought conditions, lignin concentration increased
considerably in the stems of Eucalyptus urograndis and
Eucalyptus globulus (Moura-Sobczak et al. 2011). SbCOMTs
are also implicated in salt stress responses. Under salt
stress, the contents of S and G units of lignin are raised
in Coffea arabica (de Lima et al. 2014).


Under many abiotic stress conditions, such as drought
stress, the majority of the genes of the monolignol
biosynthesis pathway are upregulated. This helps plants
resist water loss by fortifying their cell walls. We also
found that SbCOMT26 and SbCOMT36 are highly induced under
drought and salt stress; hence, they are potential targets
for manipulating lignin biosynthesis in sorghum to
engineer biomass for better biofuel yield and enhanced
abiotic stress tolerance.


CONCLUSION 


In the present research, we identified 48 COMT genes from
Sorghum bicolor. Based on a phylogenetic investigation of
COMTs, we divided the COMTs into two groups, which
suggests the existence of two ancestral genes. Gene
characterization, conserved domains, motif identification,
localization, and phylogenetic analysis revealed a close
relationship between Sorghum bicolor COMT gene homologs
and those of its relatives Oryza sativa and Zea mays. The
Ka/Ks ratios for the COMTs from Sorghum were less than
one, indicating that the COMTs have undergone strong
purifying selection. Identification of cis-acting elements
and transcription factor prediction will help to further
explore and manipulate SbCOMT genes to design better
biofuel crops. The miRNA prediction and elucidation of
expression patterns under diverse abiotic stress
conditions will help in regulating SbCOMT genes to
engineer the lignin composition, further improving
biomass and enhancing abiotic stress tolerance in
Sorghum bicolor.

Table 1: Characterization of SbCOMT homologs.

| Gene Name | Transcript ID | Chr | CDS (bp) | Introns/Exons | A.a | Domain | Mw (kDa) | Localization |
|---|---|---|---|---|---|---|---|---|
| SbCOMT-1 | Sobic.010G230800.1 | 10 | 1125 | 1/2 | 375 | Methyl transferase-2 | 41.366 | Cytoplasmic |
| SbCOMT-2 | Sobic.010G231000.2 | 10 | 1287 | 1/2 | 429 | Methyl transferase-2 | 47.265 | Cytoplasmic |
| SbCOMT-3 | Sobic.010G234500.1 | 10 | 1149 | 1/2 | 383 | Methyl transferase-2 | 42.007 | Cytoplasmic |
| SbCOMT-4 | Sobic.010G234400.1 | 10 | 1152 | 1/2 | 384 | Methyl transferase-2 | 42.131 | Cytoplasmic |
| SbCOMT-5 | Sobic.003G298500.1 | 3 | 1152 | 1/2 | 384 | Methyl transferase-2 | 41.293 | Chloroplast |
| SbCOMT-6 | Sobic.009G197600.4 | 9 | 1089 | 1/2 | 363 | Methyl transferase-2 | 39.438 | Cytoplasmic |
| SbCOMT-7 | Sobic.009G197600.5 | 9 | 1089 | 1/2 | 363 | Methyl transferase-2 | 39.438 | Cytoplasmic |
| SbCOMT-8 | Sobic.009G197800.1 | 9 | 711 | 1/2 | 237 | Methyl transferase-2 | 25.973 | Cytoplasmic |
| SbCOMT-9 | Sobic.009G043900.1 | 9 | 1299 | 2/3 | 433 | Methyl transferase-2 | 47.186 | Plasma Membrane |
| SbCOMT-10 | Sobic.009G197400.1 | 9 | 1086 | 1/2 | 362 | Methyl transferase-2 | 38.919 | Plasma Membrane |
| SbCOMT-11 | Sobic.009G197000.1 | 9 | 1068 | 1/2 | 356 | Methyl transferase-2 | 38.689 | Cytoplasmic |
| SbCOMT-12 | Sobic.009G198000.1 | 9 | 1080 | 1/2 | 360 | Methyl transferase-2 | 39.136 | Cytoplasmic |
| SbCOMT-13 | Sobic.005G129100.1 | 5 | 1140 | 1/2 | 380 | Methyl transferase-2 | 40.75 | Cytoplasmic |
| SbCOMT-14 | Sobic.005G110451.1 | 5 | 690 | 1/2 | 230 | Methyl transferase-2 | 25.578 | Cytoplasmic |
| SbCOMT-15 | Sobic.005G086600.1 | 5 | 1125 | 1/2 | 375 | Methyl transferase-2 | 40.754 | Plasma Membrane |
| SbCOMT-16 | Sobic.005G045600.1 | 5 | 1116 | 1/2 | 372 | Methyl transferase-2 | 39.886 | Plasma Membrane |
| SbCOMT-17 | Sobic.005G101900.1 | 5 | 1098 | 1/2 | 366 | Methyl transferase-2 | 40.51 | Cytoplasmic |
| SbCOMT-18 | Sobic.005G216100.1 | 5 | 1092 | 1/2 | 364 | Methyl transferase-2 | 38.627 | Cytoplasmic |
| SbCOMT-19 | Sobic.005G224400.1 | 5 | 1119 | 1/2 | 373 | Methyl transferase-2 | 40.898 | Cytoplasmic |
| SbCOMT-20 | Sobic.005G107900.1 | 5 | 1101 | 1/2 | 367 | Methyl transferase-2 | 39.939 | Cytoplasmic |
| SbCOMT-21 | Sobic.005G216200.1 | 5 | 1092 | 1/2 | 364 | Methyl transferase-2 | 38.641 | Cytoplasmic |
| SbCOMT-22 | Sobic.005G224300.1 | 5 | 1119 | 1/2 | 373 | Methyl transferase-2 | 41.077 | Cytoplasmic |
| SbCOMT-23 | Sobic.008G014000.1 | 8 | 1182 | 1/2 | 394 | Methyl transferase-2 | 42.377 | Plasma Membrane |
| SbCOMT-24 | Sobic.008G013900.1 | 8 | 807 | 1/2 | 269 | Methyl transferase-2 | 29.537 | Chloroplast |
| SbCOMT-25 | Sobic.004G083500.1 | 4 | 1098 | 1/2 | 366 | Methyl transferase-2 | 39.589 | Cytoplasmic |
| SbCOMT-26 | Sobic.004G351400.1 | 4 | 1134 | 2/3 | 378 | Methyl transferase-2 | 40.294 | Chloroplast |
| SbCOMT-27 | Sobic.004G083401.1 | 4 | 933 | 2/3 | 311 | Methyl transferase-2 | 33.646 | Chloroplast |
| SbCOMT-28 | Sobic.004G341600.1 | 4 | 1176 | 1/2 | 392 | Methyl transferase-2 | 41.801 | Cytoplasmic |
| SbCOMT-29 | Sobic.004G128400.1 | 4 | 1089 | 1/2 | 363 | Methyl transferase-2 | 39.287 | Cytoplasmic |
| SbCOMT-30 | Sobic.004G341500.1 | 4 | 1209 | 0/1 | 403 | Methyl transferase-2 | 43.573 | Cytoplasmic |
| SbCOMT-31 | Sobic.007G099400.1 | 7 | 1092 | 1/2 | 364 | Methyl transferase-2 | 40.032 | Cytoplasmic |
| SbCOMT-32 | Sobic.007G058600.1 | 7 | 1110 | 3/4 | 370 | Methyl transferase-2 | 40.347 | Chloroplast |
| SbCOMT-33 | Sobic.007G170500.1 | 7 | 1107 | 1/2 | 369 | Methyl transferase-2 | 39.719 | Cytoplasmic |
| SbCOMT-34 | Sobic.007G058400.1 | 7 | 1131 | 3/4 | 377 | Methyl transferase-2 | 40.846 | Chloroplast |
| SbCOMT-35 | Sobic.007G074800.1 | 7 | 1125 | 1/2 | 375 | Methyl transferase-2 | 40.628 | Chloroplast |
| SbCOMT-36 | Sobic.007G047300.1 | 7 | 1089 | 1/2 | 363 | Methyl transferase-2 | 39.59 | Cytoplasmic |
| SbCOMT-37 | Sobic.007G058800.1 | 7 | 1125 | 3/4 | 375 | Methyl transferase-2 | 41.064 | Chloroplast |
| SbCOMT-38 | Sobic.007G059100.1 | 7 | 1134 | 3/4 | 378 | Methyl transferase-2 | 41.492 | Chloroplast |
| SbCOMT-39 | Sobic.001G354400.1 | 1 | 1098 | 1/2 | 366 | Methyl transferase-2 | 39.583 | Mitochondrial |
| SbCOMT-40 | Sobic.001G246700.1 | 1 | 852 | 0/1 | 284 | Methyl transferase-2 | 33.125 | Plasma Membrane |
| SbCOMT-41 | Sobic.001G246700.2 | 1 | 930 | 0/1 | 310 | Methyl transferase-2 | 30.032 | Plasma Membrane |
| SbCOMT-42 | Sobic.001G354200.1 | 1 | 1116 | 1/2 | 372 | Methyl transferase-2 | 40.284 | Cytoplasmic |
| SbCOMT-43 | Sobic.001G456650.1 | 1 | 300 | 3/4 | 100 | Methyl transferase-2 | 39.602 | Chloroplast |
| SbCOMT-44 | Sobic.006G008000.1 | 6 | 1125 | 1/2 | 375 | Methyl transferase-2 | 40.857 | Plasma Membrane |
| SbCOMT-45 | Sobic.006G007900.1 | 6 | 1125 | 1/2 | 375 | Methyl transferase-2 | 40.884 | Plasma Membrane |
| SbCOMT-46 | Sobic.002G079500.1 | 2 | 1140 | 1/2 | 380 | Methyl transferase-2 | 40.46 | Chloroplast |
| SbCOMT-47 | Sobic.002G079500.2 | 2 | 1032 | 2/3 | 344 | Methyl transferase-2 | 36.41 | Chloroplast |
| SbCOMT-48 | Sobic.002G077700.1 | 2 | 1188 | 1/2 | 396 | Methyl transferase-2 | 42.752 | Chloroplast |

Table 2: SbCOMT protein parameters.

| Gene Name | Protein length (a.a.) | Protein molecular weight (kDa) | pI | GRAVY |
|---|---|---|---|---|
| SbCOMT-1 | 375 | 41.366 | 5.46 | 0.133 |
| SbCOMT-2 | 429 | 47.265 | 5.51 | 0.036 |
| SbCOMT-3 | 383 | 42.007 | 5.08 | 0.012 |
| SbCOMT-4 | 384 | 42.131 | 5.21 | 0.026 |
| SbCOMT-5 | 384 | 41.293 | 5.6 | 0.22 |
| SbCOMT-6 | 363 | 39.438 | 5.42 | 0.15 |
| SbCOMT-7 | 363 | 39.438 | 5.42 | 0.15 |
| SbCOMT-8 | 237 | 25.973 | 4.65 | 0.248 |
| SbCOMT-9 | 433 | 47.186 | 5.3 | 0.168 |
| SbCOMT-10 | 362 | 38.919 | 5.45 | 0.204 |
| SbCOMT-11 | 356 | 38.689 | 5.43 | 0.175 |
| SbCOMT-12 | 360 | 39.136 | 5.8 | 0.231 |
| SbCOMT-13 | 380 | 40.75 | 5.56 | 0.132 |
| SbCOMT-14 | 230 | 25.578 | 5.46 | -0.065 |
| SbCOMT-15 | 375 | 40.754 | 5.13 | 0.2 |
| SbCOMT-16 | 372 | 39.886 | 4.86 | 0.218 |
| SbCOMT-17 | 366 | 40.51 | 5.61 | 0.056 |
| SbCOMT-18 | 364 | 38.627 | 4.91 | 0.184 |
| SbCOMT-19 | 373 | 40.898 | 5.38 | 0.117 |
| SbCOMT-20 | 367 | 39.039 | 5.75 | 0.109 |
| SbCOMT-21 | 364 | 38.641 | 4.91 | 0.185 |
| SbCOMT-22 | 373 | 41.077 | 5.35 | 0.066 |
| SbCOMT-23 | 394 | 42.377 | 6 | 0.081 |
| SbCOMT-24 | 269 | 29.537 | 6.5 | -0.065 |
| SbCOMT-25 | 366 | 39.589 | 5.45 | 0.139 |
| SbCOMT-26 | 378 | 40.294 | 5.32 | -0.011 |
| SbCOMT-27 | 311 | 33.646 | 5.2 | 0.106 |
| SbCOMT-28 | 392 | 41.801 | 5 | 0.195 |
| SbCOMT-29 | 363 | 39.287 | 5.39 | 0.172 |
| SbCOMT-30 | 403 | 43.573 | 5.23 | 0.08 |
| SbCOMT-31 | 364 | 40.032 | 5.39 | 0.086 |
| SbCOMT-32 | 370 | 40.347 | 5.39 | 0.008 |
| SbCOMT-33 | 369 | 39.719 | 5.84 | 0.194 |
| SbCOMT-34 | 377 | 40.846 | 4.93 | 0.117 |
| SbCOMT-35 | 375 | 40.628 | 7.13 | 0.086 |
| SbCOMT-36 | 363 | 39.59 | 5.46 | -0.015 |
| SbCOMT-37 | 375 | 41.064 | 5.05 | 0.007 |
| SbCOMT-38 | 378 | 41.492 | 5.15 | 0.052 |
| SbCOMT-39 | 366 | 39.583 | 5.77 | 0.166 |
| SbCOMT-40 | 284 | 33.125 | 5.72 | 0.205 |
| SbCOMT-41 | 310 | 30.032 | 4.97 | 0.253 |
| SbCOMT-42 | 372 | 40.284 | 5.76 | 0.174 |
| SbCOMT-43 | 100 | 39.602 | 5.54 | 0.033 |
| SbCOMT-44 | 375 | 40.857 | 5.13 | 0.199 |
| SbCOMT-45 | 375 | 40.884 | 5.15 | 0.199 |
| SbCOMT-46 | 380 | 40.46 | 4.94 | 0.181 |
| SbCOMT-47 | 344 | 36.41 | 5.29 | 0.029 |
| SbCOMT-48 | 396 | 42.752 | 5.55 | -0.03 |

pI, isoelectric point; GRAVY, grand average of hydropathicity index.
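
The GRAVY index is the mean Kyte-Doolittle hydropathy value over all residues of a protein. The sketch below shows that calculation on a short, made-up peptide. The function name and the example sequence are illustrative assumptions, and the SbCOMT sequences themselves are not reproduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: sum of residue values / length."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    # A KeyError here means the sequence contains a non-standard residue.
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Hypothetical peptide, used only to show the arithmetic.
print(f"GRAVY = {gravy('MGSTA'):.3f}")
```

Positive values indicate an overall hydrophobic protein and negative values an overall hydrophilic one.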
Table 3: Non-synonymous and synonymous substitution rates of sorghum COMT paralog genes

| Gene-1 | Gene-2 | Ka | Ks | Ka/Ks | T (MYA) |
|---|---|---|---|---|---|
| SbCOMT1 | SbCOMT2 | 0.093790365 | 0.279215 | 0.335907 | 7.148655862 |
| SbCOMT3 | SbCOMT4 | 0.02093362 | 0.166845 | 0.125468 | 1.59555029 |
| SbCOMT28 | SbCOMT30 | 0.099433025 | 0.390486 | 0.254639 | 7.578736631 |
| SbCOMT23 | SbCOMT24 | 0.114992761 | 0.302893 | 0.379648 | 8.764692129 |
| SbCOMT8 | SbCOMT12 | 0.01094577 | 0.081252 | 0.134714 | 0.834281284 |
| SbCOMT19 | SbCOMT22 | 0.065428074 | 0.459634 | 0.142348 | 4.98689589 |
| SbCOMT14 | SbCOMT20 | 0.139548407 | 0.696896 | 0.200243 | 10.63631153 |
| SbCOMT32 | SbCOMT34 | 0.066990967 | 0.080131 | 0.836019 | 5.106018846 |
| SbCOMT18 | SbCOMT21 | 0.001836361 | 0.03202 | 0.057351 | 0.139966528 |

Ks, synonymous substitution; Ka, non-synonymous substitution; T (MYA), evolution time in million years ago. Time
calculated based on the formula T = Ks/2x, where x is 6.56 × 10⁻⁹.


Table 4: COMT genes in the four different genomes sequenced


| Species | Total no. of protein-coding genes | Predicted no. of COMT genes | Genome size | Reference |
|---|---|---|---|---|
| Sorghum bicolor v3 | 34129 | 48 | 730 Mb | Phytozome |
| Arabidopsis thaliana TAIR10 | 27416 | 17 | 135 Mb | Ensembl |
| Oryza sativa v7 | 42189 | 39 | 500 Mb | Ensembl |
| Zea mays v4 | 39498 | 32 | 2.13 Gb | Phytozome |

Table 5: Primers used for qRT-PCR study

| S.no | Name | Sequence | Len | Tm | GC% |
|---|---|---|---|---|---|
| 1 | SbPAL3-FP | 5'-GGICTTGTCCGCTCCCTGAAC-3' | 21 | 62.96 | 61.90 |
| 1 | SbPAL3-RP | 5'-TCGCGCCCTGGATCTICAC-3' | 19 | 62.37 | 63.16 |
| 2 | SbPAL8-FP | 5'-CTCGTCTCCGCCAGGAAGA-3' | 19 | 61.05 | 63.16 |
| 2 | SbPAL8-RP | 5'-GACGGGTTCATGGTCAGCAC-3' | 20 | 61.30 | 60.00 |
| 3 | SbC4H2-FP | 5'-AACCTGATGTCCCTCGCCAA-3' | 20 | 61.49 | 55.00 |
| 3 | SbC4H2-RP | 5'-GGCCTTTCCCCGTGAAGATG-3' | 20 | 61.03 | 60.00 |
| 4 | Sb4CL4-FP | 5'-TGCAGACCTACTGCTTCGGG-3' | 20 | 61.39 | 60.00 |
| 4 | Sb4CL4-RP | 5'-AGTTGCGGAGCAGGTTCATC-3' | 20 | 60.67 | 55.00 |
| 5 | SbHCT2-FP | 5'-GACGACTACGGTGACTICGC-3' | 20 | 60.79 | 60.00 |
| 5 | SbHCT2-RP | 5'-CCAGACATGCCATCCGCTAC-3' | 20 | 60.88 | 60.00 |
| 6 | SbC3H1-FP | 5'-GGAGCACGCAAAGTCTCTCA-3' | 20 | 60.32 | 55.00 |
| 6 | SbC3H1-RP | 5'-TCTGCCATTGCCCACTCAAC-3' | 20 | 60.90 | 55.00 |
| 7 | SbCCoAOMT3-FP | 5'-CAGTGGGGGTTCATGCAGTC-3' | 20 | 60.96 | 60.00 |
| 7 | SbCCoAOMT3-RP | 5'-TACTCCCTGCTCACGTCGAA-3' | 20 | 60.6 | 55.00 |
| 8 | SbCCoAOMT1-FP | 5'-CGGAGGACGGCACGATCT-3' | 18 | 60.26 | 58.00 |
| 8 | SbCCoAOMT1-RP | 5'-CGAAGTCGAACGACCCGTG-3' | 19 | 59.30 | 59.69 |
| 9 | SbCCR1-FP | 5'-GACCTGGGATTGGAGTTCCG-3' | 20 | [illegible] | 60.00 |
| 9 | SbCCR1-RP | 5'-CACGCACGGATGGCGATT-3' | 18 | 61.20 | 61 |
| 10 | SbF5H1-FP | 5'-CATGGACGTGATGTTTGGCG-3' | 20 | 60.18 | 55.00 |
| 10 | SbF5H1-RP | 5'-TGAGGAAGGGGAGCTTGTCC-3' | 20 | 61.20 | 60.00 |
| 11 | SbCAD2-FP | 5'-CGTCCGAGAGGAAGGTGGTC-3' | 20 | 61.94 | 65.00 |
| 11 | SbCAD2-RP | 5'-GGGTACTTTGAAGCCCCGAG-3' | 20 | [illegible] | 60.00 |
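
The Len and GC% columns of the primer table follow directly from the sequences. The sketch below recomputes them for three primers copied from Table 5. The function name is an illustrative assumption, and Tm is not recomputed because the method used to estimate it is not stated.

```python
# Primer sequences copied from Table 5 (5' to 3').
PRIMERS = {
    "SbPAL8-FP": "CTCGTCTCCGCCAGGAAGA",
    "SbPAL8-RP": "GACGGGTTCATGGTCAGCAC",
    "SbC4H2-FP": "AACCTGATGTCCCTCGCCAA",
}


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    bases = sequence.upper()
    if not bases:
        raise ValueError("sequence must not be empty")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


for name, seq in PRIMERS.items():
    print(f"{name}: length = {len(seq)}, GC% = {gc_percent(seq):.2f}")
```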


Tools and databases used in this study:

1. MEGA v 7.0 (http://www.megasoftware.net)
2. TBtools (https://github.com/CJ-Chen/TBtools/releases)
3. Wolfpsort (https://wolfpsort.hgc.jp)
4. Phytozome (http://www.phytozome.net/)
5. MEME (http://meme-suite.org/tools/meme)
6. Protparam (https://web.expasy.org/protparam/)
7. SMART (http://smart.embl-heidelberg.de/)
8. PFAM database (http://pfam.xfam.org/search)
9. Ka/Ks calculator (http://services.cbu.uib.no/tools/kaks)
10. PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/)
11. SWISS-MODEL SERVER (https://swissmodel.expasy.org/)
12. Protein structure verification server (SAVES v6.0) (https://saves.mbi.ucla.edu/)
13. STRING database (https://string-db.org/)
14. A Plant Small RNA Target Analysis Server (https://www.zhaolab.org/psRNATarget/)
15. Cytoscape (https://cytoscape.org/)
16. Plant transcription factor database (PTFDB) (http://planttfdb.gao-lab.org/)
17. Gramene database (https://www.gramene.org/)
18. MOROKOSHI Sorghum transcriptome database (http://sorghum.riken.jp)

Fig 1: The evolutionary relationship, gene structure, and motif analysis of the 48 SbCOMTs from Sorghum bicolor. a. The
phylogenetic tree was constructed by MEGA v10.0 with the NJ method. b. Structures of the 48 putative SbCOMT genes. c.
Motif distribution of SbCOMT proteins. d. Conserved domains. e. Methyltransferase-2 domain. The different motifs are
designated by different colours.
Fig 2: a. Phylogenetic tree representing the evolutionary relationship of COMTs from Sorghum bicolor, Oryza sativa, Zea
mays, and Arabidopsis thaliana. b. Physical mapping of sorghum COMT gene homologs. The 15 paralog gene pairs are
Fig 3: Synteny analysis of SbCOMT genes a. between Sorghum and Oryza sativa and b. between Sorghum and Zea mays. Gray lines
in the background indicate the collinear blocks within sorghum and other plant genomes. The pink lines represent
COMTs with collinearity in different genomes.
Fig 4: Predicted cis-regulatory elements in the promoter regions of SbCOMT genes
Fig 5: a. Structural analysis of 48 modelled Sorghum bicolor SbCOMT proteins. b. STRING analysis of sorghum COMT. The SbCOMT
protein exhibited interaction with various lignin biosynthetic pathway and secondary metabolite partners. SbCOMT
(Sb07g003860.1), CAD (Sb04g005950.1), Phenylalanine ammonia-lyase (Sb04g026510.1, Sb04g026520.1, Sb01g014020.1,
Fig 6: a. miRNA prediction of SbCOMT genes. The blue colour indicates predicted miRNA targets and the yellow
boxes represent candidate SbCOMT homologs. b. Transcription factor prediction analysis of SbCOMT homologs. NAC,
MYB, and MYB-related TFs are indicated in blue. The network is built with Cytoscape.
Fig 7: In silico expression analysis of SbCOMT: a. different regions of stem internodes, b. vascular and non-vascular
systems, c. baseline expression patterns in various tissues and organs, d. expression patterns under drought stress conditions,
and e. expression patterns under ABA, PEG, and NaOH stress conditions.
Fig 8: Heat map representing the expression patterns (fold change) of SbCOMT genes in S. bicolor analyzed by qRT-PCR.
CL-control leaf, CS-control stem, CR-control root, DL-drought leaf, DS-drought stem, DR-drought root, SL-salt leaf, SS-salt
stem, and SR-salt root. (p=0.05)